5 research outputs found

FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions

Author: Berenguel-Baeta Bruno
Bermudez-Cameo Jesus
Guerrero Jose J.
Publication date: 04/10/2022

In this work we present FreDSNet, a deep learning solution which obtains semantic 3D understanding of indoor environments from single panoramas. Omnidirectional images offer task-specific advantages for scene understanding problems because they provide 360-degree contextual information about the entire environment. However, the inherent characteristics of omnidirectional images make it harder to obtain accurate detection and segmentation of objects or a good depth estimation. To overcome these problems, we exploit convolutions in the frequency domain, obtaining a wider receptive field in each convolutional layer. These convolutions make it possible to leverage the whole context information from omnidirectional images. FreDSNet is the first network that jointly provides monocular depth estimation and semantic segmentation from a single panoramic image exploiting fast Fourier convolutions. Our experiments show that FreDSNet performs similarly to task-specific state-of-the-art methods for semantic segmentation and depth estimation. FreDSNet code is publicly available at https://github.com/Sbrunoberenguel/FreDSNet
Comment: 7 pages, 5 figures, 3 tables
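
The claim that frequency-domain convolutions widen the receptive field can be illustrated with a minimal NumPy sketch. This is not the FreDSNet architecture or its code: the function name `spectral_conv2d`, the image size and the random weights are assumptions chosen for illustration. The sketch perturbs one input pixel and counts how many output pixels change, first for a 3x3 spatial kernel and then for weights defined directly on the spectrum.

```python
import numpy as np

def spectral_conv2d(image, spectral_weights):
    """Multiply the 2D spectrum of `image` by `spectral_weights`, then invert.

    Pointwise multiplication in the frequency domain is a circular
    convolution in the spatial domain.
    """
    spectrum = np.fft.rfft2(image)
    return np.fft.irfft2(spectrum * spectral_weights, s=image.shape)

def changed_pixels(image, spectral_weights):
    """Count output pixels affected by perturbing the single input pixel (0, 0)."""
    baseline = spectral_conv2d(image, spectral_weights)
    bumped = image.copy()
    bumped[0, 0] += 1.0
    difference = spectral_conv2d(bumped, spectral_weights) - baseline
    return int(np.sum(np.abs(difference) > 1e-9))

rng = np.random.default_rng(0)
height, width = 8, 16  # hypothetical toy panorama size
image = rng.standard_normal((height, width))

# A 3x3 spatial kernel expressed through its spectrum: local receptive field.
kernel = np.zeros((height, width))
kernel[:3, :3] = 1.0
local_weights = np.fft.rfft2(kernel)

# Weights defined directly on the spectrum (random here, learned in a network).
spectrum_shape = (height, width // 2 + 1)
global_weights = (rng.standard_normal(spectrum_shape)
                  + 1j * rng.standard_normal(spectrum_shape))

local_count = changed_pixels(image, local_weights)
global_count = changed_pixels(image, global_weights)
print(local_count, "of", height * width, "pixels change with the 3x3 kernel")
print(global_count, "of", height * width, "pixels change with spectral weights")
```

With the 3x3 kernel only the nine neighbouring outputs respond to the perturbation, while weights placed directly on the spectrum let a single layer propagate it across essentially the whole image.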

arXiv.org e-Print Archive

OmniSCV: An omnidirectional synthetic image generator for computer vision

Author: Berenguel-Baeta Bruno
Bermudez-Cameo Jesús
Guerrero José J.
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

Omnidirectional and 360° images are becoming widespread in industry and in consumer society, causing omnidirectional computer vision to gain attention. Their wide field of view allows a great amount of information about the environment to be gathered from a single image. However, the distortion of these images requires the development of specific algorithms for their treatment and interpretation. Moreover, a large number of images is essential for the correct training of learning-based computer vision algorithms. In this paper, we present a tool for generating datasets of omnidirectional images with semantic and depth information. These images are synthesized from a set of captures that are acquired in a realistic virtual environment for Unreal Engine 4 through an interface plugin. We gather a variety of well-known projection models such as equirectangular and cylindrical panoramas, different fish-eye lenses, catadioptric systems, and empiric models. Furthermore, we include in our tool photorealistic non-central-projection systems such as non-central panoramas and non-central catadioptric systems. As far as we know, this is the first reported tool for generating photorealistic non-central images in the literature. Moreover, since the omnidirectional images are made virtually, we provide pixel-wise information about semantics and depth as well as perfect knowledge of the calibration parameters of the cameras. This allows the creation of ground-truth information with pixel precision for training learning algorithms and testing 3D vision approaches. To validate the proposed tool, different computer vision algorithms are tested, such as line extraction from dioptric and catadioptric central images, 3D layout recovery and SLAM using equirectangular panoramas, and 3D reconstruction from non-central panoramas.
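
As an illustration of how pixel-wise depth in an equirectangular panorama relates to 3D structure, the following Python sketch maps each pixel to a unit viewing ray and scales it by depth. It is not taken from the OmniSCV tool: the function names, the axis convention (y up, z forward at the image centre) and the assumption that depth is measured along the ray are illustrative choices.

```python
import numpy as np

def equirect_rays(width, height):
    """Unit viewing ray for every pixel of an equirectangular panorama.

    Assumed convention: columns span longitude [-pi, pi), rows span
    latitude [pi/2, -pi/2], y points up, z points at the image centre.
    """
    u = (np.arange(width) + 0.5) / width    # pixel centres in [0, 1)
    v = (np.arange(height) + 0.5) / height
    longitude = (u - 0.5) * 2.0 * np.pi
    latitude = (0.5 - v) * np.pi
    lon, lat = np.meshgrid(longitude, latitude)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)     # shape (height, width, 3)

def depth_to_points(depth):
    """Back-project a depth map (distance along each ray) to 3D points."""
    height, width = depth.shape
    return equirect_rays(width, height) * depth[..., None]

# Toy example: every pixel is 2 units away, so the points lie on a sphere.
depth = np.full((4, 8), 2.0)
points = depth_to_points(depth)
print(points.shape)
```

Other projection models named in the abstract (fish-eye, catadioptric, non-central) would need their own pixel-to-ray mappings; only the equirectangular case is sketched here.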

Multidisciplinary Digital Publishing Institute

Repositorio Universidad de Zaragoza

Omnidirectional Image Data-Set for Computer Vision Applications

Author: Berenguel-Baeta Bruno
Bermudez-Cameo Jesus
Guerrero Jose Jesus
Publication venue: 'Universidad de Zaragoza'
Publication date: 22/12/2020

In this paper we present an image dataset from different omnidirectional systems. The images include full colour, depth, instance segmentation and room layout information. This dataset aims to help in the training and testing of different neural networks and in the development of computer vision algorithms.

Universidad Zaragoza: Open Journal Systems

Simulador de imágenes omnidireccionales fotorealistas para visión por computador [Photorealistic omnidirectional image simulator for computer vision]

Author: Berenguel Baeta Samuel Bruno
Bermúdez Cameo Jesús
Guerrero Campo José Jesús
Publication venue: 'Universidad de Zaragoza'
Publication date: 01/01/2019

This project is motivated by the need for databases of omnidirectional and panoramic images for computer vision. Their wide field of view makes it possible to obtain a large amount of information about the environment from a single image. However, the distortion inherent in these images requires specific algorithms for their processing and interpretation. In addition, a large number of images is essential for correctly training computer vision algorithms based on deep learning. Acquiring, labeling and preparing these images manually with real systems requires an amount of time and work that in practice limits the size of these databases. This work proposes implementing a tool that generates photorealistic synthetic omnidirectional images and automates their generation and labeling, as a strategy for increasing the size of these databases. The work relies on the virtual environments that can be created with the Unreal Engine 4 game engine, used together with one of its plugins, UnrealCV. From these virtual environments, images from a variety of omnidirectional and 360° cameras are built with photorealistic quality. The characteristics of the environment also make it possible to generate depth and semantic images. Because everything is done virtually, the camera's acquisition parameters and the characteristics of the environment can be controlled, making it possible to build a database with automatic, unsupervised labeling. Given the calibration parameters, position and orientation of the camera, and the layout of the environment and its objects, the ground truth for various vision algorithms can be obtained.
With the available images and information, it is possible to evaluate line-extraction algorithms for dioptric and catadioptric images, layout recovery in panoramas, and 3D reconstruction methods such as simultaneous localization and mapping (SLAM).

Repositorio Universidad de Zaragoza

Detección y segmentación de objetos en imágenes panorámicas [Object detection and segmentation in panoramic images]

Author: Berenguel Baeta Samuel Bruno
de Nova Guerrero Alejandro
Guerrero Campo José Jesús
Publication venue: 'Universidad de Zaragoza'
Publication date: 01/01/2021

In this work, a convolutional neural network for object detection and semantic segmentation in indoor panoramic images has been programmed from scratch. It is based on the architecture of the BlitzNet neural network, which was developed for conventional images, but it is adapted for direct application to panoramic images by using equirectangular convolutions instead of standard convolutions to deal with the distortion present in panoramic images. The detection and semantic segmentation branches share a large part of the network structure, which facilitates joint learning of both tasks. In addition, the proposed architecture performs both tasks at different scales, so it can detect and segment objects of various sizes. The model has been implemented in Python using TensorFlow 2.x, a dedicated deep learning library that is more efficient and simpler than its previous version, TensorFlow 1.x, in which the original BlitzNet network is programmed.

Repositorio Universidad de Zaragoza